14 May 2018

Hello

  • Who's used R?
  • Who's used R Markdown?

This talk is for beginners.

It isn't about how to code.

It's about improving what we do.

The problem is the process

The Excel-Word approach is looooong and error-prone. Exaggerated example:

  1. Receive non-machine-readable data in Excel format
  2. Use a pre-prepared Excel template to reformat the data
  3. Generate figures, tables and plots
  4. Copy and paste to ~100-page Word document
  5. Find error/need to alter something
  6. Fix it
  7. Spend ages re-copy-pasting the updated bits
  8. Get confused about the current working version
  9. Repeat steps 5 to 7 until 4am

Is this reproducible?

Questions to ask yourself:

  • are the steps recorded?
  • what's the chance of human error (e.g. bad copy-paste)?
  • is it easy to keep track of changes?
  • could someone else reproduce it from scratch independently?

Minimise the risk

We can reduce error and go faster with R.

  1. Receive machine-readable data in Excel format
  2. Run a pre-prepared file written with R Markdown
  3. Make minor tweaks if needed and re-render stress-free
  4. Use the saved time to prepare more informative commentary and have a cup of tea

What's an R Markdown file?

A document with filetype .Rmd in which you:

  • write plain text that's 'marked-up' with symbols (*, [], ^, etc)
  • embed R code to read, process and visualise data
  • click a button to render it to HTML, Word or PDF

This is already being done in the department and across government (Reproducible Analytical Pipelines).

What does it look like?

RStudio window showing example R Markdown file on the left and the rendered output on the right

What does it look like?

RStudio window showing example R Markdown file on the left ('header', 'text' and 'code chunk' highlighted) and the rendered output on the right

Header

Body text: formatting

You type:

*Italic*, **bold**, super^script^ and a [link](https://www.gov.uk).

You get:

Italic, bold, superscript and a link.

Body text: inline code (maths)

You can write some R code in the middle of a sentence!

Wrap the code in backticks (the button under the Esc key) and start it with the letter r.

You type:

The answer to 1 + 1 is `r 1 + 1`

You get:

The answer to 1 + 1 is 2

Body text: inline code (stored values)

Let's say my_value <- "Lotad"

You type:

The best Pokemon is `r my_value`

You get:

The best Pokemon is Lotad
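
In a real report the stored value would usually be calculated from data rather than typed by hand. Here's a minimal sketch of that idea, assuming the chickwts dataset that comes with R. The name mean_weight is made up for this example.

```r
# chickwts ships with R; mean_weight is a hypothetical name for this sketch.
# Calculate the mean chick weight and round it to one decimal place.
mean_weight <- round(mean(chickwts$weight), 1)

# In the body text of the .Rmd file you could then type:
#   The mean chick weight is `r mean_weight` grams
```

If the data change, the number in the sentence updates the next time you render. No copy-pasting needed.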

Code chunks: input

You type:

Here's an important plot of chick weights and feed types.

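```{r chicks}
library(dplyr)    # provides %>%, group_by() and summarise()
library(ggplot2)  # provides ggplot() and geom_col()
chickwts %>%
  group_by(feed) %>%
  summarise(mean_weight = mean(weight)) %>%  # one value per feed type
  ggplot() +
  geom_col(aes(x = feed, y = mean_weight))
```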

Don't worry about the code, just know that it's going to produce something.

Code chunks: output

You get:

Here's an important plot of chick weights and feed types.

Simple steps

  1. In RStudio: File > New File > R Markdown
  2. Write your text and embed your code
  3. Click 'Knit' to render the document
  4. The file is output to the same folder as your .Rmd file
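
You don't have to click the button. The 'Knit' button calls the render() function from the rmarkdown package, so you can do the same thing from the console. Here's a rough sketch, assuming the rmarkdown package and pandoc are installed. The file name example.Rmd and its content are made up for illustration.

```r
# Write a tiny R Markdown file to a temporary folder.
# The file name and content are hypothetical, just for this sketch.
rmd_path <- file.path(tempdir(), "example.Rmd")
writeLines(c(
  "---",
  "title: \"Example\"",
  "output: html_document",
  "---",
  "",
  "The answer to 1 + 1 is `r 1 + 1`"
), rmd_path)

# Only try to render if the rmarkdown package and pandoc are both available.
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

# render() returns the path of the file it created.
out_path <- if (can_render) {
  rmarkdown::render(rmd_path, quiet = TRUE)
} else {
  NA_character_
}
```

With your own file you would only need the render() line, for example with the path to your .Rmd in place of rmd_path.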

DEMO TIME!

To summarise

RStudio window with 'Knit' button highlighted and content of R Markdown tab shown

What's actually happening?

Your R Markdown is rendered into 'plain' markdown by the knitr package (which is why you 'knit' to render the document). A tool called pandoc then converts the markdown to your output format.

.Rmd ➡️ knitr ➡️ .md ➡️ pandoc ➡️ .html/.pdf/.docx
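
If you're curious, you can watch the first arrow happen on its own. This is a small sketch, assuming the knitr package is installed. It feeds one line of R Markdown text to knit() and gets plain markdown back. The names rmd_text and md_text are made up for this example, and the pandoc step isn't shown.

```r
# One line of R Markdown body text containing inline R code.
rmd_text <- "The answer to 1 + 1 is `r 1 + 1`"

# knit() runs the R code and returns plain markdown as text.
# Guarded so the sketch does nothing if knitr isn't installed.
md_text <- if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::knit(text = rmd_text, quiet = TRUE)
} else {
  NA_character_
}
```

The result in md_text is ordinary markdown with the code replaced by its answer, which is what pandoc then turns into HTML, Word or PDF.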

This isn't essential knowledge.

Further reading